{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Search and Query Knowledge Graphs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
    "<div class=\"toc\">\n",
    "    <ul class=\"toc-item\">\n",
    "        <li><span><a href=\"#Search\" data-toc-modified-id=\"Search-1\">Search</a></span></li>\n",
    "        <li><span><a href=\"#Query\" data-toc-modified-id=\"Query-2\">Query</a></span></li>\n",
    "        <li><span><a href=\"#Query-Streaming\" data-toc-modified-id=\"Query-Streaming-3\">Query Streaming</a></span></li>\n",
    "        <li><span><a href=\"#Using-Results\" data-toc-modified-id=\"Using-Results-4\">Using Results</a></span></li>\n",
    "    </ul>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are a couple different ways to access the data of the knowledge graph. [`search`](https://developers.arcgis.com/python/latest/api-reference/arcgis.graph.html#arcgis.graph.KnowledgeGraph.search) provides results based on a full-text search string. [`query_streaming`](https://developers.arcgis.com/python/latest/api-reference/arcgis.graph.html#arcgis.graph.KnowledgeGraph.query_streaming) provides results based on an openCypher query string. The goal of this guide is to provide examples of various searches and queries that could be performed on a knowledge graph.\n",
    "\n",
    "## Search\n",
    "\n",
    "Searching the knowledge graph reaches out to the [search endpoint](https://developers.arcgis.com/rest/services-reference/enterprise/kgs-graph-search.htm) which searches for the given string on any properties of both entities or relationships by default. To search for properties on only entities or only relationships you can use the `category` parameter on `search` and set the value to either `entities`, `relationships`, or `both` (default)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "knowledge_graph.search(\"Esri\", category=\"entities\", as_dict=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These results are a generator that can be accessed as a list such as:\n",
    "```\n",
    "[\n",
    "    [\n",
    "        Entity(\n",
    "            properties={\n",
    "                'employee_count': 5000, \n",
    "                'product_count': 80, \n",
    "                'name': 'Esri', \n",
    "                'globalid': UUID('24bf1bdd-a17b-4140-aea8-fbc5701be7ea'), \n",
    "                'established_on': 3600000, 'objectid': 1\n",
    "            }, \n",
    "            type_name='Company', \n",
    "            id=UUID('24bf1bdd-a17b-4140-aea8-fbc5701be7ea')\n",
    "        )\n",
    "    ]\n",
    "]\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Search uses Lucene syntax, which allows for more advanced searches such as using a wildcard (*), searching a specific property (name:Esri), and boolean operators like AND and OR. For more information about Lucene syntax, [see the syntax guide from Apache](https://lucene.apache.org/core/2_9_4/queryparsersyntax.html)."
   ]
  },
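  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of the Lucene features described above, the snippet below builds search strings that use a property-scoped term, a wildcard, and a boolean operator. The helper names `field_term` and `any_of` are hypothetical and are not part of the ArcGIS API for Python, and the company name `Acme` is made up for illustration. Only the resulting strings would be passed to `search`.\n",
    "\n",
    "```python\n",
    "def field_term(prop, value):\n",
    "    # scope a term to a single property, e.g. name:Esri\n",
    "    return f'{prop}:{value}'\n",
    "\n",
    "\n",
    "def any_of(*terms):\n",
    "    # join terms so that a match on any one of them is enough\n",
    "    return ' OR '.join(terms)\n",
    "\n",
    "\n",
    "# match the name property exactly\n",
    "exact_name = field_term('name', 'Esri')\n",
    "# match any name that starts with Esr\n",
    "name_prefix = field_term('name', 'Esr*')\n",
    "# match either of two names\n",
    "either_company = any_of(field_term('name', 'Esri'), field_term('name', 'Acme'))\n",
    "\n",
    "print(exact_name)\n",
    "print(name_prefix)\n",
    "print(either_company)\n",
    "```\n",
    "\n",
    "Any of these strings could then be used in place of `\"Esri\"` in the `search` call shown earlier, assuming the referenced properties exist in your graph."
   ]
  },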
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Query Streaming\n",
    "\n",
    "Query streaming accepts a query string the same way query does, but also allows the additional parameters:\n",
    "- bind_param, which accepts any number of key: value pairs of parameters you would like to include in the query that are created outside of the query. This includes any primitive types as well as geometries, lists, and anonymous objects.\n",
    "- include_provenance, a boolean parameter used to determine whether provenance records will be returned as part of the query results.\n",
    "- as_dict, a boolean parameter used to determine if results are returned as dictionaries or [Entity](https://developers.arcgis.com/python/latest/api-reference/arcgis.graph.html#entity)/[Relationship](https://developers.arcgis.com/python/latest/api-reference/arcgis.graph.html#relationship)/[Path](https://developers.arcgis.com/python/latest/api-reference/arcgis.graph.html#path) objects. It is strongly recommended to use False and True will be removed in the future. The current default is True.\n",
    "\n",
    "Another benefit to using query streaming is the resulting records are not limited to the server's query limits, but rather returned in chunks and presented as a generator which can be used to retrieve all results at once using list() or go through each record one at a time using next()."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# using bind parameters in queries\n",
    "\n",
    "# list example\n",
    "query_list = ['Megan', 'Emma', 'Cameron', 'Noah']\n",
    "results = knowledge_graph.query_streaming(\"MATCH (p:Person) WHERE p.name IN $list RETURN p\", bind_param={\"list\": query_list}, as_dict=False)\n",
    "\n",
    "# anonymous object example\n",
    "query_obj = {\"props\": {\"name\": \"Megan\"}, \"list\": ['Emma', 'Cameron', 'Noah']}\n",
    "results = knowledge_graph.query_streaming(\"MATCH (n:Person)-[:FriendsWith]-(e:Person) WHERE n.name = $object.props.name AND e.name in $object.list RETURN n, e\", bind_param={\"object\": query_obj}, as_dict=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The output of these queries are a generator, so they need to be handled slightly different from the regular query output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# handling results - get all results\n",
    "for result in list(results):\n",
    "    # do something with each result\n",
    "    print(result)\n",
    "\n",
    "# handling results - get results one at a time\n",
    "next(results)\n",
    "\n",
    "# or loop through all results using next\n",
    "while True:\n",
    "    try:\n",
    "        # do something with each result\n",
    "        print(next(results))\n",
    "    except StopIteration:\n",
    "        break"
   ]
  },
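  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because a generator is consumed as it is read, it can help to take only as many records as needed instead of calling `list()` on everything. The sketch below uses a stand-in generator, `fake_query_results` (hypothetical; it only mimics the one-row-at-a-time shape of a `query_streaming` result), together with `itertools.islice` from the standard library to preview the first rows and then continue with the rest.\n",
    "\n",
    "```python\n",
    "from itertools import islice\n",
    "\n",
    "\n",
    "def fake_query_results(names):\n",
    "    # stand-in for a query_streaming result: yields one row (a list) per record\n",
    "    for name in names:\n",
    "        yield [name]\n",
    "\n",
    "\n",
    "results = fake_query_results(['Megan', 'Emma', 'Cameron', 'Noah'])\n",
    "\n",
    "# take the first two rows without reading the whole generator\n",
    "preview = list(islice(results, 2))\n",
    "\n",
    "# the generator keeps its position, so this picks up at the third row\n",
    "remaining = list(results)\n",
    "\n",
    "# once exhausted, next() with a default returns that default instead of raising\n",
    "after_end = next(results, None)\n",
    "```\n",
    "\n",
    "The same pattern should apply to a real result generator, under the assumption that it behaves like a standard Python generator."
   ]
  },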
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# including provenance in query results\n",
    "results = knowledge_graph.query_streaming(\"MATCH (n:Provenance) RETURN n LIMIT 1\", include_provenance=True, as_dict=False)\n",
    "list(results)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The result of this query would look similar to:\n",
    "```\n",
    "[\n",
    "    [\n",
    "        Entity(\n",
    "            properties={\n",
    "                'instanceID': UUID('19a35172-0a3a-4372-91ee-9240aa36f925'), \n",
    "                'propertyName': 'name', \n",
    "                'sourceType': 'String', \n",
    "                'typeName': 'Document', \n",
    "                'globalid': UUID('fc9333d2-27e6-407e-a29f-04946e72a891'), \n",
    "                'sourceName': 'MySourceName', \n",
    "                'source': 'MySource', \n",
    "                'objectid': 1}, \n",
    "            type_name='Provenance', \n",
    "            id=UUID('fc9333d2-27e6-407e-a29f-04946e72a891')\n",
    "        )\n",
    "    ]\n",
    "]\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using Results\n",
    "\n",
    "Query responses can get much more complex depending on what is returned from the query. openCypher allows many different types of returns including entities, relationships, properties, anonymous objects, lists, and more. Providing this response as a list of lists guarantees the response can be used once returned.\n",
    "\n",
    "If entities are returned that have a shape (are spatial) it can be useful to view the results of that query in a map. To do so, you can create a data frame from the properties in the results and spatially enable that data frame using the shape field to plot it on a map."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<img src=></img>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# openCypher query matches all entities (assume all entities returned are spatial for this example, OneType represents returning entities of a single type)\n",
    "query_results = knowledge_graph.query_streaming(\"MATCH (n:OneType) RETURN n\", as_dict=False)\n",
    "query_results = list(query_results)\n",
    "\n",
    "# create a list of all properties of the type to use as columns of our data frame\n",
    "props_list = []\n",
    "for prop in query_results[0][0].properties:\n",
    "    props_list.append(prop)\n",
    "\n",
    "# iterate through the results of the query, writing those results to a list to be used in the data frame\n",
    "results_list = []\n",
    "for result in query_results:\n",
    "    single_result = []\n",
    "    # write each property value to a list\n",
    "    for prop in props_list:\n",
    "        single_result.append(result[0].properties[prop])\n",
    "    # append the list of properties to the data list\n",
    "    results_list.append(single_result)\n",
    "\n",
    "# create a data frame that holds all properties of the observation entities\n",
    "obs_df = pd.DataFrame(data=results_list, columns=props_list)\n",
    "# set the spatial column to shape\n",
    "obs_df.spatial.set_geometry('shape')\n",
    "\n",
    "# create a map of the results\n",
    "new_map = gis.map()\n",
    "new_map.basemap = 'gray-vector'\n",
    "obs_df.spatial.plot(map_widget=new_map, renderer_type='s', marker_size=5, symbol_type='simple', colors=[252,226,5,90], outline_color=[0,0,0,90], line_width=0.5)\n",
    "new_map"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  },
  "vscode": {
   "interpreter": {
    "hash": "d622b5871f1605057390dea3c8b45e995d0d19bef8604acd7f5b2e1066a85139"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
